A linear-space geometric theory of intersymbol interference is
introduced in this paper. An equivalence between the structure of
intersymbol interference and a wide-sense stationary discrete random
process is demonstrated and exploited to show the equivalence of
zero-forcing (decision-feedback) equalization to minimum mean-square
error linear interpolation (prediction) of a random process. This
equivalence is used
to quickly derive the properties of these equalizers and give them additional 
geometric interpretation. Results from prediction theory are used to 
develop practical computational methods of determining the tap-gains of 
the infinite equalizers for both rational and nonrational channel power 
spectra. Finally, the theory of reproducing kernel Hilbert spaces is used 
to develop a theory of equalization for nonstationary channels with non- 
stationary noise. 

I. INTRODUCTION 

The analysis of digital communication systems from a geometrical 
viewpoint — the viewing of waveforms as points in a signal space and 
the identification of cross-correlation with the formation of an inner 
product — is by now well established. To a large extent, this approach 
has been popularized by the book of Wozencraft and Jacobs. 1 However, 
when it comes to analyzing systems with intersymbol interference, 
frequency-domain techniques have almost exclusively been relied upon. 
The purpose of this paper is to consider pulse-amplitude modulation 
(PAM) systems with intersymbol interference from a geometric 
standpoint, and more specifically to develop a geometric theory of 
equalization. 

Consideration of the geometric structure of intersymbol interference
leads immediately to the observation of a striking correspondence
to the theory of minimum mean-square error (MMSE) linear
estimation of a wide-sense stationary discrete-parameter random
process. The fact that the latter subject is almost exclusively treated
by geometric methods 2,3 is further impetus for this approach to
equalization.

The theories of linear zero-forcing equalization and decision-feedback
equalization are well established. The properties of linear equalization
are summarized by Lucky, et al., 4 while the present state of knowledge
of decision-feedback equalization is summarized by Monsen 5 and
Price. 6 The primary analysis tools which have been used are the
calculus of variations in the case of linear equalization and Toeplitz forms
in the case of decision-feedback equalization.

Fig. 1 - (a) Communication system model, (b) Matched-filter receiver,
(c) Zero-forcing equalizer, (d) Decision-feedback equalizer.

In this paper, the geometric approach enables us to treat the two 
types of equalization simultaneously using the same mathematical 
framework, in which the relationship between them becomes very clear 
and many of their known properties are given an additional geometric 
interpretation. Many of the results follow directly from the theory of 
MMSE estimation. In addition to the unification and reinterpretation 
of previously known results, the geometric approach leads to
extensions of the theory in several directions. Among these are the
derivation of an orthogonal expansion in Section 2.5 which is useful in many
problems involving intersymbol interference, the development of 
practical iterative techniques for determining equalizer tap-gains 
(the infinite case) in Section 3.4, the extension of the theory of equali- 
zation to nonstationary noise and a time-varying channel in Section 
IV, and numerous results on the minimum distance problem associated 
with the performance analysis of the Viterbi algorithm maximum 
likelihood detector in a companion paper. 7 

This paper together with a companion one 7 expand upon an earlier 
talk. 8 Readers desiring a limited and short treatment of this subject 
may wish to refer there. The geometrical approach to intersymbol 
interference was also employed to a limited extent in the author's 
thesis. 9 

1.1 Problem Statement 

We will consider the detection of a sequence of digital data digits,
$B_k$, each assuming one of a finite and predetermined number of levels,
from the reception

$$r(t) = \sum_{k} B_k\, h(t - kT) + n(t) \tag{1}$$

as determined from the communication system model of Fig. 1a. It
will be assumed initially that $n(t)$ is white Gaussian noise (this
assumption will be relaxed in Section IV).† A simple matched-filter receiver
for the reception of $r(t)$ is shown in Fig. 1b. In the first of two
equivalent formulations of this receiver, the reception is cross-correlated
with $h(t - kT)$ and the decision on $B_k$ made by applying a series of
thresholds to the result; in the second formulation the cross-correlator
is realized as a filter with impulse response $h(-t)$ (commonly called a
matched filter) whose output is sampled at $t = kT$. The matched-filter
receiver is optimum when there is no intersymbol interference,
but in the presence of intersymbol interference the matched filter will
respond to more than a single data digit and the performance of the
receiver will be degraded.

† The assumption of Gaussian noise is not necessary for the majority of results to
follow, and in particular those which involve only second-order statistics of the noise.

When there is intersymbol interference, a common approach is to
build a linear filter, called a zero-forcing equalizer (ZFE), which
responds to only a single time-translate of $h(t)$ (this can only be
approximated in practice). The most common form of this equalizer, shown
in Fig. 1c, is a matched filter followed by a transversal filter (MFTF).
As $N \to \infty$ the tap-gains of the transversal filter can be chosen such
that the threshold input is a function of only a single data digit. It is
important to note for future reference that the MFTF can also be
modeled in the manner of Fig. 1b as a cross-correlation of $r(t)$ with a
linear sum of time translates of $h(t)$,

$$\sum_{m=-N}^{N} a_m\, h(t - mT).$$

The decision-feedback equalizer (DFE) embodies a slightly different
philosophy in which the DFE forward filter is allowed to respond to
past (but not future) translates of $h(t)$; the residual interference from
past data digits is then subtracted out prior to the decision threshold
using past decisions. A realization of the DFE using again the MFTF
approach is shown in Fig. 1d. The tap coefficients are now chosen to
null the response to future data digits; this can be accomplished as
$N \to \infty$.

The shortcoming of both the ZFE and DFE is that their linear
filters remove intersymbol interference without regard to the effect
on the noise; the result is that in eliminating the intersymbol
interference (or a portion thereof) they necessarily enhance the noise.† It
seems clear intuitively that since the DFE eliminates interference
from only future data digits, it has more degrees of freedom than the
ZFE and should therefore be capable of less noise enhancement. A
proof that this is always the case has been given by Price; 6 his method
was to determine an explicit formula for the DFE S/N ratio using
Toeplitz form theory and compare it with the known S/N ratio of the
ZFE. 4 Additional interpretation of this result will be given in Section
3.1.

† In addition, the DFE is susceptible to decision errors. The effect of errors will
not receive consideration here.

A review of some requisite material on linear spaces and MMSE 
linear estimation is given in Sections 2.1 and 2.2. Readers familiar 
with this material are nevertheless urged to scan these sections for 
notation to be employed in the remainder of the paper. The ZFE and 
DFE are reformulated in Section 2.3. In Section 2.4 the relationship 
between intersymbol interference and MMSE estimation is discussed, 
and a useful orthogonal expansion arising out of this relationship is 
derived in Section 2.5. 

Section III develops a geometric theory of the ZFE and DFE. 
Conditions necessary and sufficient for the existence of these equalizers 
are given in Section 3.1, their performance is discussed in Section 3.2, 
a useful property of the DFE with regard to its output noise sequence 
is interpreted in Section 3.3, methods of calculating the tap-gains are 
derived in Section 3.4, and the relationship between finite and infinite 
transversal filter equalizers receives consideration in Section 3.5. 

Sections II and III are concerned with additive white noise ex- 
clusively. Section IV extends the theory to colored Gaussian noise, 
nonstationary Gaussian noise, and a time-varying channel using the 
theory of reproducing kernel Hilbert spaces (RKHS). 

II. AN EQUIVALENCE TO DISCRETE RANDOM PROCESSES 

The structure of the intersymbol interference in (1) will now be 
shown to have an equivalence to a wide-sense stationary random 
process. The starting point will be a quick review of linear spaces and 
of minimum mean-square error (MMSE) linear estimation of a random process.

2.1 Hilbert Space Notation 10 

An inner product space $\mathcal{L}$ consists of a linear space together with a
defined inner product $\langle x, y \rangle$ between two elements $x$ and $y$. All spaces
in this paper are Hilbert spaces, which consist of an inner product
space satisfying an additional closure property (specifically, the limits
of Cauchy sequences must be in the space). The inner product induces
a norm, or "length" of a vector,

$$\|x\|^2 \triangleq \langle x, x \rangle \tag{2}$$

and the notion of the distance between two vectors, $\|x - y\|$. The
geometrical interpretation of these quantities is illustrated in Fig. 2.

Fig. 2 - Interpretation of inner product, norm, and distance.

A subspace of $\mathcal{L}$ is any set of vectors which itself constitutes a linear
space. If $x_k$, $k \in I$ is a countable or finite sequence of vectors, then
we denote by $M(x_k, k \in I)$ the closure of the subspace consisting of
all finite linear combinations of elements of the set $\{x_k, k \in I\}$ and
call this the subspace spanned by the $x_k$'s. It is convenient to think
of elements of $M(x_k, k \in I)$ as convergent (possibly) infinite sums
of the form

$$\sum_{k \in I} a_k x_k$$

even though in some obscure cases not all elements can be expressed
in this form.

In many minimization problems it is desired to find the element
of some closed subspace $M$ which is closest to a vector $y$; the resulting
element, called the projection of $y$ on $M$, is denoted by $P(y; M)$,
and satisfies the orthogonality property

$$\langle y - P(y; M), x \rangle = 0 \tag{3}$$

for all $x \in M$.† The geometric interpretation of (3) is shown in Fig. 3
for a one-dimensional subspace spanned by $x$; for this case the
projection must be a scalar times $x$ and the validity of (3) is apparent.
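
The orthogonality property (3) can be checked numerically in a finite-dimensional stand-in for the Hilbert space. The following is only a sketch: the vectors are random elements of $R^6$, the helper name `project` is hypothetical, and least squares is used as one convenient way to compute the projection.

```python
import numpy as np

def project(y, basis):
    """Orthogonal projection of y onto the span of the rows of `basis`."""
    B = np.asarray(basis, dtype=float).T            # columns span the subspace M
    coeffs, *_ = np.linalg.lstsq(B, y, rcond=None)  # least-squares coefficients
    return B @ coeffs

rng = np.random.default_rng(0)
y = rng.standard_normal(6)
basis = rng.standard_normal((3, 6))   # three vectors spanning M (illustrative data)

p = project(y, basis)
residual = y - p

# Property (3): the residual is orthogonal to every spanning vector of M,
# hence to every element of M.
inner_products = basis @ residual
```

Because the residual is orthogonal to M, no other element of M (for example `basis[0]`) can be closer to `y` than `p`.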

2.2 Review of Linear Mean-Square Interpolation and Prediction 2,3

We will now quickly review the theory of linear mean-square
estimation of a random variable.

The set of random variables with zero mean and finite variance is a
linear space, since the sum of any two such random variables itself
has these properties. This set is also a Hilbert space with inner product

$$\langle X, Y \rangle = E(XY), \tag{4}$$

where $E(\cdot)$ denotes expected value. It is standard to suppress the
sample space dependence of a random variable as has been done in (4)
because the geometric properties (inner product and norm) are
determined by the value of the random variable on the whole sample space;
that is, by its statistics in their entirety.

† When, as in (3), a vector is orthogonal to every vector in $M$, it is said to be
orthogonal to $M$.

Fig. 3 - Projection on subspace spanned by $x$.

Consider now the following interpolation problem: Suppose that a
sequence of zero-mean random variables $X_k$, $-\infty < k < \infty$, with
finite variances is given and it is desired to estimate $X_0$ based on the
observation of $X_k$, $k \neq 0$. If the estimate is further stipulated to be
linear, it is the same as requiring that it be an element of $M(X_k, k \neq 0)$.
Suppose that the estimate $\hat{X}_0$ is to be chosen in such a way that the
mean-square error between $X_0$ and the estimate is minimized:

$$\min E(X_0 - \hat{X}_0)^2. \tag{5}$$

From (4) and the previous section, the MMSE linear interpolator is

$$\hat{X}_0 = P[X_0, M(X_k, k \neq 0)], \tag{6}$$

the projection of $X_0$ on $M(X_k, k \neq 0)$.

A second estimation problem which will be of interest is the
prediction of $X_0$ based only on $X_k$, $k > 0$ (an anticausal prediction). The
MMSE linear predictor is the projection of $X_0$ on the subspace
spanned by $X_k$, $k = 1, 2, \ldots$, denoted by $P[X_0, M(X_k, k > 0)]$.
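
As a concrete illustration of the two projections, the sketch below solves the normal equations implied by the orthogonality property for a finite observation window. The exponential autocorrelation $R_k = A^{|k|}$, the value $A = 0.6$, the window size, and the function name `mmse_linear_estimate` are assumptions chosen for the example, not part of the theory above.

```python
import numpy as np

def mmse_linear_estimate(R, observed):
    """Coefficients and mean-square error for estimating X_0 from X_k,
    k in `observed`, where R(k) = E[X_m X_{m+k}]."""
    idx = list(observed)
    gram = np.array([[R(i - j) for j in idx] for i in idx], dtype=float)
    cross = np.array([R(i) for i in idx], dtype=float)   # <X_0, X_k>
    a = np.linalg.solve(gram, cross)    # normal equations from orthogonality
    err = R(0) - cross @ a              # squared norm of the error vector
    return a, err

A = 0.6                                  # assumed correlation parameter
R = lambda k: A ** abs(k)                # assumed exponential autocorrelation
N = 5                                    # observation window half-width

interp_idx = [k for k in range(-N, N + 1) if k != 0]   # X_k, k != 0
pred_idx = list(range(1, N + 1))                       # X_k, k > 0

a_int, err_int = mmse_linear_estimate(R, interp_idx)
a_pred, err_pred = mmse_linear_estimate(R, pred_idx)
```

For this particular autocorrelation the finite window already gives the infinite-window answers: the predictor uses only $X_1$ with weight $A$, and the interpolation error is smaller than the prediction error because the interpolator projects onto a larger subspace.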

2.3 Zero-Forcing and Decision-Feedback Equalization 

We are now prepared to restate the problem of determining the ZFE
and DFE filters in a linear space context. It will be assumed that the
basic pulse $h(t)$ in (1) has finite energy (i.e., is square integrable),

$$\int_{-\infty}^{\infty} h^2(t)\, dt < \infty. \tag{7}$$

The set of waveforms which satisfies (7) is a linear space, which we
denote by $L_2$. $L_2$ is also a Hilbert space with inner product

$$\langle x, y \rangle = \int_{-\infty}^{\infty} x(t)\, y(t)\, dt \tag{8}$$

for any two $L_2$ waveforms $x(t)$ and $y(t)$. For the same reason that the
sample space dependence of a random variable was suppressed in (4),
the time dependence of the waveforms $x(t)$ and $y(t)$ has been
suppressed on the left side of (8): it is the entire time waveform which
determines the geometric properties.

The class of filters† which will be considered will be limited to those
which can be modeled as an inner product (or cross-correlation) of the
reception $r(t)$ with some $L_2$ waveform. A ZFE is a filter corresponding
to a waveform $g_k(t)$ which does not respond to any translate of $h(t)$
except $h(t - kT)$,

$$\int_{-\infty}^{\infty} h(t - mT)\, g_k(t)\, dt = 0, \quad m \neq k, \tag{9}$$

but does respond to $h(t - kT)$,

$$\int_{-\infty}^{\infty} h(t - kT)\, g_k(t)\, dt \neq 0, \tag{10}$$

in order that there be a signal on which to base the decision. It is
evident that if $g_0(t)$ satisfies (9) and (10) for $k = 0$, then they are also
satisfied by $g_k(t) = g_0(t - kT)$ for $k \neq 0$. Written in inner product
notation, (9) and (10) become

$$\langle h_k, g_0 \rangle = 0, \quad k \neq 0, \tag{11}$$

$$\langle h_0, g_0 \rangle \neq 0, \tag{12}$$

where we have written $h_k$ for $h(t - kT)$. The analogous condition for a
DFE forward filter is

$$\langle h_k, g_0 \rangle = 0, \quad k > 0, \tag{13}$$

$$\langle h_0, g_0 \rangle \neq 0. \tag{14}$$

The forms of the ZFE and DFE in this symbolic notation are shown
in Figs. 4a and b. The output of the linear filter is a function of $B_k$
(a single data digit) for a ZFE and $B_{k-m}$, $m \geq 0$ (all past data digits)
for a DFE. The tap-gains of the feedback transversal filter storing
past decisions for the DFE are equal to the responses of $g_0$ to previous
pulses, $\langle g_0, h_{-m} \rangle$, $m \geq 1$.

† In the case of the DFE, we refer only to the forward filter.

Fig. 4 - Symbolic representations of the two equalizers: (a) zero-forcing
equalizer; (b) decision-feedback equalizer.



2.4 A Congruence Relationship 

Two Hilbert spaces which display an identical geometrical structure 
are said to be congruent 11 or unitarily equivalent. 10 Specifically, in 
order for two Hilbert spaces to be congruent, there must exist between 
them a one-to-one and onto linear mapping which preserves norms 
and inner products. Although the elements of two such spaces may be 
quite different entities, when considered as elements of their respective 
Hilbert spaces they have the same geometrical structure. 

Define the autocorrelation function of the pulse sequence,

$$R_k = \langle h_m, h_{m+k} \rangle. \tag{15}$$

It follows from the inequality

$$0 \leq \Big\| \sum_{m=0}^{N} \alpha_m h_{k_m} \Big\|^2 = \sum_{m=0}^{N} \sum_{n=0}^{N} \alpha_m \alpha_n R_{k_m - k_n}$$

that $\{R_k\}$ is a nonnegative definite function. Therefore, there exists
a second-order discrete random process $\{X_k\}$ which has
autocorrelation $R_k$,

$$\langle X_m, X_{m+k} \rangle = E(X_m X_{m+k}) = R_k. \tag{16}$$
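
The nonnegative definiteness of $\{R_k\}$ can be seen numerically on a sampled pulse. The sketch below is illustrative only: the two-sided exponential pulse, the sampling rate of 8 samples per symbol interval, and the use of circular shifts to form translates are all assumptions made to keep the example short.

```python
import numpy as np

T = 8                                   # samples per symbol interval (assumed)
t = np.arange(-20 * T, 20 * T + 1)
h = np.exp(-np.abs(t) / (1.5 * T))      # hypothetical pulse with overlapping tails

def R(k):
    """R_k = <h_0, h_k>, approximated by a sum over samples."""
    return float(h @ np.roll(h, k * T))

K = 6
gram = np.array([[R(m - n) for n in range(K)] for m in range(K)])
eigs = np.linalg.eigvalsh(gram)         # eigenvalues of the Toeplitz Gram matrix

# The quadratic form equals the squared norm of the weighted sum of translates.
alpha = np.random.default_rng(1).standard_normal(K)
quad = alpha @ gram @ alpha
combo = sum(a * np.roll(h, m * T) for m, a in enumerate(alpha))
direct = float(combo @ combo)
```

Since `quad` is a squared norm for every choice of `alpha`, the Gram matrix can have no negative eigenvalue, which is the content of the inequality above.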

For the random process defined in (16), $M(h_k, k \in I)$ and $M(X_k, k \in I)$
are congruent through the obvious mapping

$$\phi\Big[ \sum_{m=1}^{N} \alpha_m h_{k_m} \Big] = \sum_{m=1}^{N} \alpha_m X_{k_m} \tag{17}$$

which is a unitary linear transformation. To verify this, observe that
the mapping is linear, preserves norms,

$$\Big\| \phi\Big( \sum_{m=1}^{N} \alpha_m h_{k_m} \Big) \Big\|^2 = \Big\| \sum_{m=1}^{N} \alpha_m X_{k_m} \Big\|^2 = \sum_{m=1}^{N} \sum_{n=1}^{N} \alpha_m \alpha_n R_{k_m - k_n} = \Big\| \sum_{m=1}^{N} \alpha_m h_{k_m} \Big\|^2, \tag{18}$$

and preserves inner products by an equally simple derivation.

The mapping of (17) is only defined for finite sums. When $I$ is an
infinite set, $\phi$ can be extended to all of $M(h_k, k \in I)$ by taking limits
in the mean. For any $f \in M(h_k, k \in I)$ there exists a sequence $\{f_k\}$,
each consisting of a finite sum of the form of (17), such that $f_k \to f$.
Since $\phi(f_k)$ is a Cauchy sequence from (18), we define $\phi(f)$ as the
limit of $\phi(f_k)$, which is in $M(X_k, k \in I)$ by completeness.

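There is an additional congruence which is useful. From the
definition of $R_k$ in (15), we see that

$$R_k = \frac{1}{2\pi} \int_{-\pi/T}^{\pi/T} R(\omega)\, e^{-j\omega kT}\, d\omega, \tag{19}$$

where

$$R(\omega) = \sum_{m=-\infty}^{\infty} \Big| H\Big(\omega + \frac{2\pi m}{T}\Big) \Big|^2 = T \sum_{n=-\infty}^{\infty} R_n e^{jn\omega T}. \tag{20}$$

Here $H(\omega)$ denotes the Fourier transform of $h(t)$, and $R(\omega)$ is an
equivalent power spectrum of the channel. From (16),
$R(\omega)/T$ is the power spectrum of the random process $\{X_k\}$. Let
$L_2(-\pi/T, \pi/T; R)$ denote the Hilbert space of all complex-valued
Lebesgue measurable functions $f(\omega)$ with domain $|\omega| < \pi/T$ which
satisfy

$$\|f(\omega)\|^2 = \frac{1}{2\pi} \int_{-\pi/T}^{\pi/T} |f(\omega)|^2 R(\omega)\, d\omega < \infty \tag{21}$$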
with the obvious definition of the inner product. A frequently invoked
congruence is between $M(X_k, -\infty < k < \infty)$ and $L_2(-\pi/T, \pi/T; R)$. 2
By implication, $L_2(-\pi/T, \pi/T; R)$ and $M(h_k, -\infty < k < \infty)$ are
also congruent through the mapping

$$\sum_{m=1}^{N} \alpha_m h_{k_m} \;\mapsto\; \sum_{m=1}^{N} \alpha_m e^{-j\omega k_m T} \tag{22}$$

as is readily verified.

In the remainder of this paper, the congruence demonstrated in this
section will be exploited to demonstrate that many available results
on MMSE interpolation and prediction theory are directly applicable
to the equalization problems posed in Section 2.3.

2.5 An Orthogonal Expansion

The congruence relation of Section 2.4 will be used in this section to
establish an orthogonal expansion in $M(h_k, -\infty < k < \infty)$ which
will be particularly useful in the sequel.

Define the element

$$e_k^+ = h_k - P[h_k, M(h_m, m > k)] \tag{23}$$

which is the difference between a translate of $h(t)$, $h_k$, and its
projection on the subspace of translates to its right. It will be shown later
that this element is of particular significance to the DFE. For the
moment, however, note that $e_k^+$ is equivalent to the MMSE prediction
error of $X_k$ based on $X_m$, $m > k$, since the projection is the optimum
linear predictor. It is well known 3 that the successive prediction errors
of a random process are uncorrelated random variables. The
equivalent statement relating to $e_k^+$ is that

$$\langle e_m^+, e_n^+ \rangle = \|e_0^+\|^2\, \delta_{m,n} \tag{24}$$

and it is an orthogonal sequence.† This is readily demonstrated directly
by noting that $e_m^+$ is orthogonal to $M(h_k, k > m)$, which contains $e_n^+$
for $n > m$. Hence, (24) follows for $n > m$ and by symmetry for $n < m$
also.

From (24) it follows that as long as

$$\|e_0^+\| > 0 \tag{25}$$

the sequence

$$w_n = e_n^+ / \|e_0^+\|, \quad -\infty < n < \infty, \tag{26}$$

is an orthonormal set in $L_2$. The significance of (25) is that the
equivalent random process must not be linearly predictable with vanishing
mean-square error (in the language of Ref. 3, p. 564, $X_k$ must be
"regular," or "nondeterministic").

† The norm of $e_k^+$ is independent of $k$ since $e_k^+$ is a time translate of $e_0^+$.

Expanding $h_n$ in a Fourier series in $w_n$,

$$h_n = u_n + v_n, \qquad u_n = \sum_{m=-\infty}^{\infty} c_m w_{n+m},$$

$$c_m = \langle w_{n+m}, h_n \rangle = \langle w_m, h_0 \rangle,$$

$$\langle v_n, w_m \rangle = 0, \quad -\infty < n < \infty, \; -\infty < m < \infty, \tag{27}$$

where $v_n$ is the remainder. Equation (27) can be simplified by observing
that

$$\langle w_m, h_0 \rangle = 0, \quad m < 0,$$

since $h_0 \in M(h_k, k \geq 0)$ and $w_m$ is orthogonal to $M(h_k, k \geq m + 1)$,
which contains $M(h_k, k \geq 0)$ when $m < 0$. In addition, it can be shown
(Ref. 3, pp. 571-575) that $v_n = 0$, since the spectrum under
consideration here is absolutely continuous.† Thus, (27) reduces to

$$h_n = \sum_{m=0}^{\infty} c_m w_{n+m}, \qquad c_m = \langle h_0, w_m \rangle. \tag{28}$$

The expansion of (28), which is used in the theory of linear prediction, 2,3
is similar in spirit to a straightforward Gram-Schmidt
orthogonalization process, but is much more useful in that the coefficients of the
expansion are independent of $n$. The main shortcoming of the
expansion (28) is requirement (25).

The formula for $c_m$ given in (27) is not very useful in explicitly
evaluating the coefficients of (28). A more useful method of evaluation
is to observe that it is a spectral factorization problem. Defining the
bilateral z-transform‡ of the autocorrelation,

$$R^*(z) = \sum_{m=-\infty}^{\infty} R_m z^m, \tag{29}$$

we claim that

$$R^*(z) = \sum_{n=0}^{\infty} c_n z^n \sum_{n=0}^{\infty} c_n z^{-n}, \tag{30}$$

where the $c_m$ are given by (28). To show (30), first calculate $R_j$ from
(15),

$$R_j = \langle h_0, h_j \rangle = \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} c_m c_n \langle w_m, w_{j+n} \rangle = \begin{cases} \sum_{n=0}^{\infty} c_n c_{n+j}, & j \geq 0 \\ \sum_{n=-j}^{\infty} c_n c_{n+j}, & j < 0. \end{cases} \tag{31}$$

Similarly, the right side of (30) can be manipulated,

$$\sum_{n=0}^{\infty} \sum_{m=0}^{\infty} c_n c_m z^{n-m} = \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} c_n c_{n+m} z^{m} + \sum_{m=1}^{\infty} \sum_{n=m}^{\infty} c_n c_{n-m} z^{-m}, \tag{32}$$

and comparing (31) and (32), (30) is established. The representation
of (30) is not unique. However, Doob (Ref. 3, p. 160) shows that
the coefficients of (27) uniquely satisfy (30) when the additional
conditions

$$\sum_{n=0}^{\infty} c_n z^n \neq 0, \quad |z| < 1, \tag{33}$$

$$\sum_{n=0}^{\infty} c_n^2 < \infty \tag{34}$$

are required.§ The necessity of (34) is obvious from (27), while the
reason why (33) is needed is that otherwise (30) could be satisfied
on the unit circle by another sequence with a larger zeroth term,
contradicting the fact that

$$c_0 = \|e_0^+\|. \tag{35}$$

Equation (35) follows from the observation that $M(h_k, k \geq n) =
M(w_k, k \geq n)$ and therefore

$$P[h_n, M(h_k, k \geq n + 1)] = \sum_{m=1}^{\infty} c_m w_{n+m}$$

or

$$e_n^+ = c_0 w_n. \tag{36}$$

A simple example will serve to illustrate (30). Suppose $h(t)$ has an
exponential autocorrelation with

$$R_k = A^{|k|}, \quad 0 < A < 1. \tag{37}$$

† This is by virtue of the fact that integral (21) is in terms of $R(\omega)\, d\omega$; i.e., the
underlying measure is presumed to be absolutely continuous with respect to Lebesgue
measure.

‡ Note that we define the z-transform in positive powers of $z$.

§ Of course, condition (25) is also required.

Direct calculation of (29) reveals that

$$R^*(z) = \frac{1 - A^2}{(1 - Az)(1 - A/z)} \tag{38}$$

which is in the form of (30) with

$$c_n = \sqrt{1 - A^2}\, A^n. \tag{39}$$

The validity of (39) can be demonstrated directly for this simple
example by noting that

$$e_k^+ = h_k - A h_{k+1} \tag{40}$$

(as can be verified by showing that $e_k^+$ is orthogonal to $h_m$, $m \geq k + 1$)
and thus

$$w_m = \frac{h_m - A h_{m+1}}{\|h_m - A h_{m+1}\|} = \frac{h_m - A h_{m+1}}{\sqrt{1 - A^2}}. \tag{41}$$

From (28),

$$c_m = \langle h_0, w_m \rangle = \sqrt{1 - A^2}\, A^m \tag{42}$$

agreeing with (39).
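
The exponential example lends itself to a quick numerical check of (30), (31), and (38). The sketch below assumes $A = 0.7$ and truncates the infinite sums at 400 terms, which is harmless here because the terms decay geometrically.

```python
import numpy as np

A = 0.7                                  # assumed value, any 0 < A < 1 works
n = np.arange(400)                       # truncation of the infinite sums
c = np.sqrt(1 - A**2) * A**n             # coefficients (39)

def R_from_c(j):
    """Right side of (31): sum over n of c_n c_{n+|j|}."""
    j = abs(j)
    return float(c[: len(c) - j] @ c[j:])

R_vals = [R_from_c(j) for j in range(6)]   # should reproduce A**|j| of (37)

# Both sides of (30) at an arbitrary point on the unit circle, using (38).
z = np.exp(1j * 0.9)
lhs = (1 - A**2) / ((1 - A * z) * (1 - A / z))
rhs = np.sum(c * z**n) * np.sum(c * z**(-n))
```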

The procedure for higher-order rational spectra is equally simple.
From (29) and the fact that $R_m$ is real and even ($R_{-m} = R_m$), it
follows that

$$R^*(z) = R^*(1/z). \tag{43}$$

Thus, for every zero $a_i$ and pole $b_i$ of $R^*(z)$, $a_i^{-1}$ and $b_i^{-1}$ are also a
zero and a pole respectively. Thus, $R^*(z)$ can be written in the form

$$R^*(z) = K\, \frac{\prod_i (1 - a_i z)(1 - a_i z^{-1})}{\prod_i (1 - b_i z)(1 - b_i z^{-1})}, \quad |a_i|, |b_i| < 1, \tag{44}$$

so that from (30)

$$C(z) = \sum_{n=0}^{\infty} c_n z^n = \sqrt{K}\, \frac{\prod_i (1 - a_i z)}{\prod_i (1 - b_i z)}, \tag{45}$$

where (33) has been insured by the choice of zeros in (45).
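
For the special case of a spectrum with zeros only (no poles $b_i$), the root-selection step behind (44)-(45) can be sketched numerically. This is an illustration under stated assumptions rather than a general routine: the function name `spectral_factor` is hypothetical, the autocorrelation is assumed to vanish beyond lag $p$, and no zero is assumed to lie on the unit circle.

```python
import numpy as np

def spectral_factor(R):
    """R = [R_0, R_1, ..., R_p], with R_k = 0 for |k| > p.
    Returns c_0..c_p such that sum_n c_n c_{n+j} = R_j and C(z) has no
    zeros inside the unit circle, as required by (33)."""
    R = np.asarray(R, dtype=float)
    # z^p R*(z) is an ordinary polynomial; np.roots expects the highest power first.
    poly = np.concatenate([R[::-1], R[1:]])
    roots = np.roots(poly)
    a = 1.0 / roots[np.abs(roots) > 1]       # zeros of C(z) are 1/a_i, |a_i| < 1
    c = np.array([1.0 + 0j])
    for ai in a:                             # build prod_i (1 - a_i z), ascending powers
        c = np.convolve(c, [1.0, -ai])
    c = c.real                               # complex roots occur in conjugate pairs
    K = R[0] / np.sum(c**2)                  # match R_0 = K * sum of squares
    return np.sqrt(K) * c

c_example = spectral_factor([1.85, -1.08, 0.2])
```

The example autocorrelation is that of the sequence 1, -0.9, 0.2, whose transform $(1 - 0.4z)(1 - 0.5z)$ already satisfies (33), so the factorization should return that sequence.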

When $R^*(z)$ is not rational, a more general method of determining
the coefficients of (28) is required. For this purpose, we use the
equivalent power spectrum of (20). The first form in (20) is the one
required for analytically determining $C(z)$, whereas the second form is
the one which would usually be used in numerical calculations. The
relationship of $R(\omega)$ to $R^*(z)$ is, of course,

$$R(\omega) = T R^*(e^{j\omega T}), \tag{46}$$

the evaluation of $R^*(z)$ on the unit circle. The equivalent of (30) for
$R(\omega)$ is

$$R(\omega) = T\, \Big| \sum_{k=0}^{\infty} c_k e^{j\omega kT} \Big|^2. \tag{47}$$

Intuitively, (47) requires the expansion of $\sqrt{R(\omega)/T}$, with an arbitrary
phase characteristic, in a complex Fourier series with only positive
frequencies. Following Doob (Ref. 3, p. 161), expand $\log \sqrt{R(\omega)/T}$
in a Fourier series,

$$\frac{1}{2} \log \frac{R(\omega)}{T} = \sum_{k=-\infty}^{\infty} r_k e^{jk\omega T}. \tag{48}$$

This is always possible because, as will be demonstrated later, in order
for (25) to be satisfied, it is necessary and sufficient that $\log R(\omega)$ be
integrable. Define

$$g(z) = r_0 + 2 \sum_{k=1}^{\infty} r_k z^k \tag{49}$$

and note that

$$\operatorname{Re}\, g(e^{j\omega T}) = \frac{1}{2} \log \frac{R(\omega)}{T}. \tag{50}$$

We claim that

$$C(z) = e^{g(z)} \tag{51}$$

satisfies (47), since

$$|C(e^{j\omega T})| = \exp[\operatorname{Re}\, g(e^{j\omega T})] = \sqrt{R(\omega)/T}.$$

Equation (33) is also satisfied since $g(z)$ is analytic for $|z| < 1$.

Equation (51) is an analytic solution to the problem initially posed,
but a practical means of applying it numerically is required. It is
shown in Appendix A that the Fourier coefficients of (48) can be
calculated efficiently and accurately using the fast Fourier transform
(FFT) algorithm. The second difficulty is in determining $C(z)$ from
$g(z)$ in (51). This is easily resolved by noting that

$$c_m = \frac{1}{m!} \frac{d^m}{dz^m} C(z) \Big|_{z=0}, \qquad r_m = \frac{1}{2\, m!} \frac{d^m}{dz^m} g(z) \Big|_{z=0}, \quad m \geq 1, \qquad c_0 = e^{r_0} \tag{52}$$

and applying Leibniz's differentiation rule

$$\frac{d^n}{dz^n}(uv) = \sum_{m=0}^{n} \binom{n}{m} \frac{d^{n-m} u}{dz^{n-m}} \frac{d^m v}{dz^m}$$

to the product

$$\frac{d^n}{dz^n} C(z) = \frac{d^{n-1}}{dz^{n-1}} \Big( e^{g(z)} \frac{dg(z)}{dz} \Big) = \sum_{m=0}^{n-1} \binom{n-1}{m} \frac{d^{n-m} g(z)}{dz^{n-m}} \frac{d^m C(z)}{dz^m}$$

and, setting $z = 0$,

$$c_n = \frac{2}{n} \sum_{m=0}^{n-1} (n - m)\, r_{n-m}\, c_m, \quad n \geq 1. \tag{53}$$

Equations (52)-(53) give us a practical recursive method of determining
the coefficients of (28) when the channel spectrum is not rational.
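
The recursion (52)-(53) is short enough to sketch in code. What follows is an illustrative implementation under stated assumptions, not the procedure of Appendix A: the function name `factor_coefficients` is hypothetical, the spectrum $R(\omega)/T$ is assumed to be sampled on a uniform grid in $\theta = \omega T$, and an inverse FFT is used to approximate the coefficients $r_k$ of (48), which is adequate when the grid is fine enough that aliasing is negligible.

```python
import numpy as np

def factor_coefficients(spectrum, n_coeffs):
    """spectrum: samples of R(w)/T at theta = wT = 2*pi*i/N, i = 0..N-1.
    Returns c_0..c_{n_coeffs-1} of (28) using (48), (52) and (53)."""
    S = np.asarray(spectrum, dtype=float)
    # r_k of (48); the spectrum is real and even, so r_k is real and r_{-k} = r_k.
    r = np.fft.ifft(0.5 * np.log(S)).real
    c = np.zeros(n_coeffs)
    c[0] = np.exp(r[0])                                      # (52)
    for n in range(1, n_coeffs):
        m = np.arange(n)
        c[n] = (2.0 / n) * np.sum((n - m) * r[n - m] * c[m])  # (53)
    return c

# Check against the exponential example, where c_n = sqrt(1 - A^2) A^n by (39).
A = 0.6
N = 1024
theta = 2 * np.pi * np.arange(N) / N
S = (1 - A**2) / (1 - 2 * A * np.cos(theta) + A**2)   # R(w)/T from (38) and (46)
c_num = factor_coefficients(S, 20)
c_exact = np.sqrt(1 - A**2) * A ** np.arange(20)
```

The number of coefficients requested should stay well below half the grid size, since the inverse FFT returns the $r_k$ only up to aliasing.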

III. GEOMETRIC THEORY OF THE ZERO-FORCING AND DECISION-FEEDBACK 
EQUALIZERS 

The zero-forcing equalizer (ZFE) and decision-feedback equalizer 
(DFE) have been introduced in Sections 1.1 and 2.3. In this section, 
we will describe fully the characteristics of these equalizers in the 
context of the geometric structure developed in Section II. 

3.1 Conditions for the Existence of the ZFE and DFE 

The existence of a ZFE and DFE will now be related to the
interpolation and prediction of the equivalent random process defined in
Section 2.2. This relationship will then be used to obtain directly the
known conditions for their existence.

The first observation is that the subspaces $M(h_k, k \neq 0)$ and
$M(X_k, k \neq 0)$ are identical, as are the subspaces $M(h_k, k > 0)$ and
$M(X_k, k > 0)$. The element

$$e_0 = h_0 - P[h_0, M(h_k, k \neq 0)] \tag{54}$$

is the same as the interpolation error vector defined in Section 2.2,
$(X_0 - \hat{X}_0)$, while the prediction error vector is the same as

$$e_0^+ = h_0 - P[h_0, M(h_k, k > 0)]. \tag{55}$$

These two vectors are likely candidates for a ZFE and a DFE because
they are orthogonal to the subspaces $M(h_k, k \neq 0)$ and $M(h_k, k > 0)$
respectively [see Section 2.1 and eq. (3)]. Hence, they satisfy (11)
and (13) respectively. To verify that they are indeed a ZFE and a DFE,
conditions (12) and (14) must be checked. Noting that $e_0$ is orthogonal
to $M(h_k, k \neq 0)$, we have

$$\langle e_0, h_0 \rangle = \langle e_0, h_0 - P[h_0, M(h_k, k \neq 0)] \rangle = \|e_0\|^2 \tag{56}$$

by definition (54). Similarly, it follows that

$$\langle e_0^+, h_0 \rangle = \|e_0^+\|^2. \tag{57}$$

Thus, we see that a necessary and sufficient condition for $e_0$ ($e_0^+$) to be a
ZFE (DFE) is that $\|e_0\| > 0$ ($\|e_0^+\| > 0$). By definition, the projection
of $h_0$ on a subspace is the element of that subspace which is at a
minimum distance from $h_0$, and hence $\|e_0\|$ and $\|e_0^+\|$ are the minimum
distances between $h_0$ and $M(h_k, k \neq 0)$ and $M(h_k, k > 0)$ respectively.
Since $\|e_0\|$ can only vanish if $h_0 \in M(h_k, k \neq 0)$, and similarly
for $\|e_0^+\|$, it follows that $e_0$ ($e_0^+$) is a ZFE (DFE) if and only if
$h_0 \notin M(h_k, k \neq 0)$ [$h_0 \notin M(h_k, k > 0)$]. Physically, these conditions
mean that $h(t)$ must not be representable as an infinite weighted sum
of a subset of its own translates. Geometrically, it is evident in Fig. 5
that, as long as $\|e_0\| > 0$ (or $\|e_0^+\| > 0$), $e_0$ (or $e_0^+$) will have a
component in the direction of $h_0$ and the equalizer will have a response
to the desired signal.

Fig. 5 - Geometric interpretation of the zero-forcing equalizer and
decision-feedback equalizer.

The weighting functions (54)-(55) can, under reasonable conditions,†
be written in the form of a convergent linear sum of translates of $h_0$,

$$e_0 = h_0 - \sum_{k \neq 0} a_k h_k \tag{58}$$

$$e_0^+ = h_0 - \sum_{k > 0} a_k^+ h_k \tag{59}$$

for some coefficients $a_k$ and $a_k^+$. This demonstrates that these two elements
are just the matched filter followed by transversal filter (MFTF)
discussed in Section 1.1. It will be shown in the next section that the
MFTF has particular significance, in that it maximizes the S/N ratio.
In general, there will be many ZFE's and DFE's other than (58)-(59).
An example of a different ZFE is the element

$$h_0' - P[h_0', M(h_k, k \neq 0)]$$

for any $h_0'$ such that

$$\langle h_0', h_0 \rangle \neq 0, \qquad h_0 \notin M(h_k, k \neq 0).$$

An interesting question that arises is, then, whether there ever exists
a ZFE and DFE when their corresponding MFTF's do not exist. To
see that the answer is no for the ZFE (the proof for the DFE is
identical), note that if $h_0 \in M(h_k, k \neq 0)$, then any $g_0$ orthogonal to
$M(h_k, k \neq 0)$ is also necessarily orthogonal to $h_0$.‡ Thus, we have
proven the following theorem:

Theorem 1: The following five statements are equivalent:

1. $h_0 \notin M(h_k, k \neq 0)$ [$h_0 \notin M(h_k, k > 0)$].

2. $\|e_0\| > 0$ [$\|e_0^+\| > 0$].

3. There exists a ZFE [DFE].

4. There exists a ZFE [DFE] of the form of eq. (54) [eq. (55)], the
MFTF.

5. The random process defined in (16) cannot be linearly interpolated
[predicted] with vanishing mean-square error.

The fifth condition of Theorem 1 follows from our earlier identification
of $e_0$ and $e_0^+$ as the interpolation and prediction errors, respectively,
of the equivalent random process. This observation also enables us to
pull from the literature formulas for the norms of $e_0$ and $e_0^+$. The
following corollary follows directly from the known formulas for the
interpolation and prediction errors of a random process, 2,3

$$\|e_0\|^2 = \Big[ \frac{T^2}{2\pi} \int_{-\pi/T}^{\pi/T} R^{-1}(\omega)\, d\omega \Big]^{-1} \tag{60}$$

$$\|e_0^+\|^2 = \frac{1}{T} \exp\Big\{ \frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} \log R(\omega)\, d\omega \Big\}. \tag{61}$$

Corollary 1: A ZFE [DFE] exists if and only if $R^{-1}(\omega)$ [$\log R(\omega)$] is
integrable.

Both conditions relate to the fashion in which $R(\omega)$ vanishes. In
particular, both require that $R(\omega)$ vanish on at most a set of measure
zero. The relationship of (60) and (61) will be discussed more fully
in the sequel.

It should be noted also that (61) follows directly from the
orthogonal expansion of Section 2.5. From (35) we know that $\|e_0^+\|^2$ equals
$c_0^2$, while (52) gives a relation for $c_0$. When the Fourier series of (48)
is inverted and $r_0$ is substituted into (52), (61) results.

† This will be discussed fully in Section 3.5.

‡ We also make use of the trivial observation that any $g_0$ satisfying (11) is
orthogonal to $M(h_k, k \neq 0)$.
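
The interpolation and prediction error formulas can be compared numerically on the exponential example of Section 2.5. The sketch below works in the normalized frequency variable $\theta = \omega T$ with the process spectrum $S(\theta) = R(\omega)/T$, so that the interpolation error is the reciprocal of the average of $1/S$ and the prediction error is the exponential of the average of $\log S$. The value $A = 0.6$ and the grid size are assumptions.

```python
import numpy as np

A = 0.6                                   # assumed correlation parameter
N = 4096                                  # grid size for the frequency averages
theta = 2 * np.pi * np.arange(N) / N
S = (1 - A**2) / (1 - 2 * A * np.cos(theta) + A**2)   # R(w)/T for R_k = A**|k|

interp_err = 1.0 / np.mean(1.0 / S)       # squared norm of e_0 (interpolation error)
pred_err = np.exp(np.mean(np.log(S)))     # squared norm of e_0^+ (prediction error)
```

For this spectrum the two values should come out as $(1 - A^2)/(1 + A^2)$ and $1 - A^2$ respectively, the latter agreeing with $c_0^2$ from (39), and the interpolation error is the smaller of the two.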

3.2 Performance of the Equalizers 

It will now be shown that the MFTF among all ZFE's and DFE's 
maximizes the S/N ratio and minimizes the error probability in white 
Gaussian noise. The derivation will be a simple application of the 
Schwarz inequality. 

Assume that the additive noise in (1) is white and Gaussian. Then
the decision axis which is applied to a threshold is, for the ZFE,

$$\langle g_0, r \rangle = B_0 \langle g_0, h_0 \rangle + \langle g_0, n \rangle, \tag{62}$$

where $\langle g_0, n \rangle = n_0$ is a Gaussian random variable with mean zero and
variance

$$E n_0^2 = \frac{N_0}{2} \|g_0\|^2 \tag{63}$$

and $N_0/2$ is the two-sided spectral density of the noise. The minimum
probability of error decision strategy is then to apply $\langle g_0, r \rangle$ to a series
of $M - 1$ thresholds, with the specific thresholds depending on the
probability law on $B_k$. For any such law and series of thresholds the
probability of error will be a monotone decreasing function of the
S/N ratio, which is proportional to

$$S/N \propto \frac{\langle g_0, h_0 \rangle^2}{\|g_0\|^2}, \tag{64}$$

since $\langle g_0, n \rangle$ is a zero-mean Gaussian random variable with variance
proportional to $\|g_0\|^2$. Noting from (11) that $g_0$ is orthogonal to
$P[h_0, M(h_k, k \neq 0)]$ whenever $g_0$ is a ZFE, (64) can be rewritten

$$S/N \propto \frac{\langle g_0, e_0 \rangle^2}{\|g_0\|^2} \leq \|e_0\|^2 \tag{65}$$

by the Schwarz inequality, with equality if and only if $g_0$ equals $e_0$
(the MFTF) within a multiplicative constant. Thus, the MFTF,
among all ZFE's, maximizes the S/N ratio. By the same method an
identical result can be demonstrated for the DFE, if it is assumed
that the decision-feedback mechanism correctly cancels the tails of
earlier pulses.

The preceding derivation, which is a generalization of the Schwarz
inequality derivation of the matched filter, has the geometric
interpretation of Fig. 6. In writing (65), the maximization of (64) is
restricted to those $g_0$ which lie in the hyperplane orthogonal to
$P[h_0, M(h_k, k \neq 0)]$. Since every ZFE is also orthogonal to this
vector, it follows that the hyperplane so described contains the set of
all ZFE's. However, the maximization over elements of the hyperplane
does not guarantee a result which is a ZFE. The vector in the hyperplane
which has the greatest component in the direction of $h_0$ per unit
length is evidently the one which lines up with $e_0$, as verified by
(65). Fortunately, this vector also turns out to be a ZFE, so that the
maximization is complete.
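
The Schwarz-inequality argument can be checked numerically in a
finite-dimensional signal space. The following is a minimal sketch, not
part of the original derivation: the random vectors standing in for
$h_0$ and the interfering pulses, the dimensions, and the helper `snr`
are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical finite-dimensional stand-ins for the pulses: h0 plays the
# role of h_0, the columns of H_int play the role of h_k, k != 0.
dim, n_interferers = 12, 4
h0 = rng.standard_normal(dim)
H_int = rng.standard_normal((dim, n_interferers))

# e_0 = h_0 - P[h_0, M(h_k, k != 0)], computed by least squares.
coef, *_ = np.linalg.lstsq(H_int, h0, rcond=None)
e0 = h0 - H_int @ coef

def snr(g):
    """Quantity (64): (g, h_0)^2 / ||g||^2."""
    return (g @ h0) ** 2 / (g @ g)

# Orthonormal basis of the orthogonal complement of the interferers;
# every vector in it satisfies (g, h_k) = 0 for k != 0.
q, _ = np.linalg.qr(H_int, mode="complete")
null_basis = q[:, n_interferers:]

# Random zero-forcing candidates compared against the bound of (65).
candidates = [null_basis @ rng.standard_normal(dim - n_interferers)
              for _ in range(1000)]
best_random = max(snr(g) for g in candidates)
snr_mftf = snr(e0)
bound = e0 @ e0
```

In this sketch `snr_mftf` coincides with `bound`, and no randomly drawn
zero-forcing vector exceeds it, which is the content of (65).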

An additional observation relative to (65) is that the maximum
S/N ratio is proportional to $\|e_0\|^2$ for the ZFE and $\|e_0^+\|^2$
for the DFE. The maximum S/N ratio is therefore directly proportional to
the mean-square interpolation and prediction errors of the equivalent
random process. Thus, the maximum S/N ratios of the ZFE and DFE are
given by (60) and (61) respectively, while the factor by which the
S/N ratio is reduced relative to an isolated pulse with matched filter
detection is obtained by dividing by $R_0$, the isolated pulse energy.

Fig. 6 - S/N ratio maximized by the MFTF.

Fig. 7 - Geometric interpretation of eq. (66). [Labels: distance to
x-y plane; distance to x axis.]

Price 6 derived (61) by a different method and used the geometric
mean inequality for integrals to show from (60) and (61) that

$$\|e_0\|^2 \le \|e_0^+\|^2. \tag{66}$$

This important result implies that (i) the S/N ratio of the DFE MFTF
always exceeds that of the ZFE MFTF,† and (ii) a DFE exists whenever
a ZFE exists [the contrary is not true, as demonstrated by the
important example of algebraic zeros in $R(\omega)$ 1]. Using the
geometric method we have developed, two interpretations of (66) can be
given. First, it is intuitively apparent that the mean-square
interpolation error of a random process will be smaller than the
mean-square prediction error, because an interpolation is based on more
information; similarly, there will be some processes for which
interpolation, but not prediction, with zero mean-square error is
possible. Second, since $M(h_k, k \neq 0)$ contains $M(h_k, k > 0)$, the
distance between $h_0$ and $M(h_k, k \neq 0)$ (equal to $\|e_0\|^2$)
must be smaller than the distance between $h_0$ and $M(h_k, k > 0)$
(equal to $\|e_0^+\|^2$). This second interpretation is a rigorous way
of establishing (66) by a method more direct than the integral
inequality. It has the geometric interpretation of Fig. 7, where the
distance between a vector $h$ and the larger subspace (the x-y plane)
is less than between $h$ and the subspace it contains (the x axis).

The performance of the ZFE and DFE can be evaluated for any 
particular channel spectrum using (60)-(61). In particular, (60)-(61) 
can be evaluated in closed form for rational spectra. A different ap- 
proach, which allows us to evaluate the tap-gains of the equalizers 
as well, will be pursued in Section 3.4. 
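
As a rough numerical illustration of (60), (61), and (66), the sketch
below evaluates both integrals on a uniform frequency grid. It assumes
$T = 1$ and an exponential autocorrelation $R_k = A^{|k|}$, whose folded
spectrum is $(1 - A^2)/(1 - 2A\cos\omega + A^2)$; the value of `A` and
the grid size `N` are arbitrary choices, not values from the text.

```python
import numpy as np

A = 0.5                                  # assumed decay parameter, |A| < 1
N = 4096                                 # frequency samples over one period
omega = 2 * np.pi * np.arange(N) / N     # T = 1, one full period

# Folded spectrum of the assumed autocorrelation R_k = A**|k|.
R = (1 - A**2) / (1 - 2 * A * np.cos(omega) + A**2)

# (60): reciprocal of the mean of 1/R (harmonic mean of the spectrum).
e0_sq = 1.0 / np.mean(1.0 / R)
# (61): exponential of the mean of log R (geometric mean of the spectrum).
e0_plus_sq = np.exp(np.mean(np.log(R)))
```

Under these assumptions `e0_sq` agrees with $(1 - A^2)/(1 + A^2)$ and
`e0_plus_sq` with $1 - A^2$, so the ordering of (66) is visible
directly.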

† This result neglects the effect of decision errors on the DFE.

3.3 On the DFE White Output Noise Property

As observed by Price, 6 the DFE forward filter is identical to the 
"whitened matched filter" employed by Forney 12 as the first element 
of his maximum likelihood detector. The property of this filter which 
is essential to Forney's application is that the noise sequence at the 
filter output is uncorrelated. As with the other properties of this filter, 
this one has a simple explanation in terms of the relationship to linear 
prediction. 

Identifying $e_k^+$ as $e_0^+(t - kT)$, the noise sequence at the DFE
forward filter output is $(e_k^+, n)$. Since $n(t)$ is white noise, this
sequence will be uncorrelated if and only if

$$(e_m^+, e_n^+) = 0, \quad m \neq n. \tag{67}$$

The validity of (67) and an interpretation of this result in terms of 
the uncorrelated nature of the successive prediction errors of a random 
process has already been given in Section 2.5. 

3.4 Determination of Tap-Gains 

In this section, we will use the orthogonal expansion of Section 2.5 
to derive methods of determining the tap-gains of the forward and 
feedback filters of the MFTF DFE. For comparison purposes the 
well-known relation for the tap-gains of the ZFE will also be briefly 
developed. 

If we write the weighting response of the MFTF ZFE as

$$a_0 e_0 = \sum_{k=-\infty}^{\infty} a_k h_k, \tag{68}$$

where the tap-gains of the transversal filter are $a_k$,
$-\infty < k < \infty$, condition (11)-(12) becomes

$$(e_0, h_m) = \|e_0\|^2 \delta_{m,0} = \frac{1}{a_0}\sum_k a_k R_{m-k}. \tag{69}$$

Taking the bilateral z-transform of (69),

$$a_0\|e_0\|^2 = A(z)R^*(z), \tag{70}$$

where $A(z)$ is the z-transform of the tap-gains

$$A(z) = \sum_k a_k z^k. \tag{71}$$

Thus, from (70),

$$A(z) = \frac{a_0\|e_0\|^2}{R^*(z)}. \tag{72}$$

This filter is illustrated in Fig. 8a. When $h(t)$ is applied to the
input of a matched filter and the output sampled at a rate of $1/T$, the
output has z-transform $R^*(z)$. The transversal filter weighting
response has a z-transform proportional to $R^*(z)^{-1}$, so that the
output is consistent with (69).

The S/N ratio of the ZFE, given by (60), is readily derived from
(72). Writing the relation for tap-gain zero,

$$a_0 = \frac{1}{2\pi j}\oint \frac{A(z)}{z}\,dz = \frac{a_0\|e_0\|^2}{2\pi j}\oint \frac{dz}{zR^*(z)} \tag{73}$$

and solving for $\|e_0\|^2$, we immediately get (60) using (46).

As an example, for the exponential autocorrelation of (37), (72)
becomes

$$A(z) = a_0\|e_0\|^2\,\frac{(1 - Az)(1 - Az^{-1})}{1 - A^2}, \tag{74}$$

from which we get

$$\|e_0\|^2 = \frac{1 - A^2}{1 + A^2}, \qquad a_1 = a_{-1} = -\frac{a_0 A}{1 + A^2}, \qquad a_k = 0, \ |k| > 1, \tag{75}$$

a result derived by Tufts 13 by another method. This example points
out that it is never necessary to actually evaluate (60) when the
channel spectrum is rational, but rather the performance can be
obtained by equating the zero-order tap-gains of (72) in the manner
of (73).
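
The tap-gains of this example can also be recovered numerically by
inverting $1/R$ on a frequency grid, which is one way to read (72) when
no closed form is at hand. The sketch below is illustrative only: it
assumes $T = 1$, the autocorrelation $R_k = A^{|k|}$, and the
normalization $a_0 = 1$; `A` and `N` are arbitrary.

```python
import numpy as np

A = 0.5
N = 256
omega = 2 * np.pi * np.arange(N) / N
R = (1 - A**2) / (1 - 2 * A * np.cos(omega) + A**2)

# Fourier coefficients of 1/R(omega); by (72) these are proportional
# to the tap-gains a_k.
taps = np.fft.ifft(1.0 / R).real   # index k holds tap k, index N-k tap -k
taps /= taps[0]                    # normalize so that a_0 = 1

a_plus1, a_minus1 = taps[1], taps[-1]
outer = np.abs(taps[2:-1]).max()   # largest tap magnitude with |k| > 1
e0_sq = 1.0 / np.mean(1.0 / R)     # zero-order relation, as in (73)
```

With these assumptions the two neighboring taps come out as
$-A/(1 + A^2)$ and all remaining taps vanish to rounding error, in
agreement with (75).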

The situation with the DFE is only slightly more complicated. In
this case the DFE filter is

$$a_0^+ e_0^+ = \sum_{k=0}^{\infty} a_k^+ h_k, \tag{76}$$

where only taps on one side are involved. Substituting from (28) and
(36),

$$a_0^+ c_0 w_0 = \sum_{k=0}^{\infty} a_k^+ \sum_{m=0}^{\infty} c_m w_{k+m} = \sum_{m=0}^{\infty} w_m \sum_{k=0}^{m} a_k^+ c_{m-k} \tag{77}$$

and equating coefficients,

$$\sum_{k=0}^{m} a_k^+ c_{m-k} = \begin{cases} a_0^+ c_0, & m = 0 \\ 0, & m > 0. \end{cases} \tag{78}$$

Fig. 8 - Spectral representations of the MFTF zero-forcing equalizer (a)
and decision-feedback equalizer (b). [Blocks shown: matched filter
$h(-t)$, $H^*(\omega)$; sampler at $kT$; transversal filter; forward
transversal filter; threshold; feedback filter.]



From (78) we get a recursion relation for the tap coefficients which is
useful for nonrational spectra,

$$a_m^+ = -\frac{1}{c_0}\sum_{k=0}^{m-1} a_k^+ c_{m-k}, \quad m \ge 1, \tag{79}$$

and a z-transform relation which is useful for rational spectra,

$$A^+(z) = \frac{a_0^+ c_0}{C(z)}, \tag{80}$$



where $A^+(z)$ is the z-transform of the tap-gains of (76). Performing
(80) again for the autocorrelation of (37),

$$A^+(z) = a_0^+(1 - Az), \qquad \|e_0^+\|^2 = c_0^2 = 1 - A^2, \tag{81}$$

which is consistent with (40) and is larger than $\|e_0\|^2$ by a factor
of $(1 + A^2)$. As with the ZFE, the performance of the DFE can be
determined for rational spectra without the explicit evaluation of (61).
The comparison of (80) with (72) is interesting, in that they are
identical except for the fact that in (80) $C(z)$ is substituted for
$R^*(z)$ in (72). The annulus of convergence of $A(z)$ will always
include the unit circle, since $R^*(z)$ converges in an annulus
containing the unit circle. Similarly, $C(z)$ is analytic and nonzero in
a region containing the unit disk, and hence $A^+(z)$ will have only
positive powers of $z$ and converge in a region containing the unit
disk. Note that these properties of $A^+(z)$ are critically dependent on
(33) being satisfied.
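
The recursion (79) is simple to run once the coefficients $c_m$ are
available. The following sketch is a hedged illustration: the function
name `dfe_forward_taps` is hypothetical, the starting value $a_0^+ = 1$
is a normalization chosen here, and the coefficients
$c_m = c_0 A^m$ are those implied by the assumed factor
$C(z) = c_0/(1 - Az)$ for the exponential example.

```python
def dfe_forward_taps(c, n_taps, a0=1.0):
    """Recursion (79): a_m = -(1/c_0) * sum_{k<m} a_k c_{m-k}, from a_0."""
    a = [a0]
    for m in range(1, n_taps):
        acc = sum(a[k] * c[m - k] for k in range(m) if m - k < len(c))
        a.append(-acc / c[0])
    return a

A = 0.5
c0 = (1 - A**2) ** 0.5
# Assumed spectral factor coefficients: C(z) = c_0 / (1 - A z).
c = [c0 * A**m for m in range(40)]

a_plus = dfe_forward_taps(c, n_taps=8)
```

For this choice the recursion returns $a_1^+ = -A$ and zeros thereafter
(to rounding error), matching the two-tap form of (81).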

The spectral factorization method of determining the tap-gains of
the DFE was given by Monsen 5 for rational spectra. Price 6 gave a
formula valid for arbitrary spectra, but it is difficult to evaluate
numerically. Since (79) is valid for arbitrary spectra, the method
presented here represents a synthesis of the appeal and computational
simplicity of the spectral factorization method with the generality of
Price's Toeplitz form result.

We also need the tap-gains of the feedback filter for the DFE.
From Fig. 4, the required feedback tap-gains are given by
$(e_0^+, h_{-n})$, $1 \le n < \infty$. From (36) and (28),

$$b_n = (e_0^+, h_{-n}) = c_0 \sum_{m=0}^{\infty} c_m (w_0, w_{m-n}) = c_0 c_n. \tag{82}$$

Thus, the frequency response of the feedback filter is given by

$$\sum_{m=1}^{\infty} b_m z^m = c_0[C(z) - c_0]. \tag{83}$$



The z-transform representation of the DFE just derived is illustrated
in Fig. 8b. When an isolated pulse $h(t)$ is applied to the matched
filter, the sampled output has z-transform $R^*(z)$. The transversal
filter multiplies by $A^+(1/z) = c_0/C(1/z)$, as can be verified from
(76).† The z-transform of the forward transversal filter output is
$c_0 C(z)$ because of (30), which verifies the causal response which is
characteristic of the DFE. The output of the feedback filter of (83) is
then subtracted, to yield (hopefully) a delta function response $c_0^2$.
The reader can verify that when the threshold is replaced by a gain of
$1/c_0^2$ (the noise-free case) the response is as represented.
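
The chain just described can be traced sample by sample for the
exponential example. This is a sketch under stated assumptions:
matched-filter samples $R_k = A^{|k|}$, the forward filter
$c_0/C(1/z) = 1 - Az^{-1}$ implied by $C(z) = c_0/(1 - Az)$, and a
truncation to lags $-K, \dots, K$; none of these numbers come from the
text.

```python
import numpy as np

A = 0.5
K = 20                                   # keep lags -K..K
lags = np.arange(-K, K + 1)
R = A ** np.abs(lags).astype(float)      # assumed samples R_k = A**|k|

c0 = np.sqrt(1 - A**2)
# Forward filter 1 - A z^{-1}: output sample k is R_k - A * R_{k+1}.
forward_out = R[:-1] - A * R[1:]         # valid for lags -K .. K-1
out_lags = lags[:-1]

# Feedback taps (82): b_n = c_0 * c_n with c_n = c_0 * A**n, n >= 1.
feedback = np.where(out_lags >= 1,
                    c0 * c0 * A ** np.clip(out_lags, 0, None),
                    0.0)
residual = forward_out - feedback
```

The forward output vanishes at negative lags (the causal response), and
after the feedback subtraction only the lag-zero sample $c_0^2$ remains.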

3.5 Finite Transversal Filter Equalizers 

The previous sections have considered the rather idealized case of
infinite transversal filter equalizers. Since only finite equalizers can
actually be implemented, the important question arises as to when
and in what sense the infinite equalizer can be approximated by a
finite one.

† This is because (76) is not in the form of a convolution sum. This
distinction was not relevant to the ZFE due to the symmetry of that
filter.

We have already seen in the example of the exponential autocorrelation
that the infinite equalizer can degenerate into a finite transversal
filter for some channel spectra. This will happen whenever $A(z)$
and $A^+(z)$ are finite polynomials in $z$. From (72) and (80) we see
that this will occur whenever $R^*(z)$ is a rational function which has
no zeros (only poles). When the spectrum is not rational, or is rational
with zeros, it will be necessary to approximate the infinite MFTF.

It is straightforward to generalize the results of Sections 3.1 and
3.2 to subspaces spanned by a finite number of translates of $h_0$. In
particular, if we replace the criteria of (11) and (13) by

$$(h_k, g_0) = 0, \quad -N \le k \le N, \ k \neq 0 \tag{84}$$

for the ZFE and

$$(h_k, g_0) = 0, \quad 1 \le k \le N \tag{85}$$

for the DFE, we are left with the consideration of the finite
dimensional subspaces $M(h_k, -N \le k \le N, k \neq 0)$ and
$M(h_k, 1 \le k \le N)$, which we will write as $M_N$ and $M_N^+$
respectively. Then the MFTF equalizers which satisfy (84) and (85) are
similar to (54) and (55),

$$e_0(N) = h_0 - P(h_0, M_N) \tag{86}$$

$$e_0^+(N) = h_0 - P(h_0, M_N^+). \tag{87}$$

It is straightforward to see that Theorem 1 can be replaced by the
following version:

Theorem 2: The following four statements are equivalent:

1. $h_0 \notin M_N$ [$h_0 \notin M_N^+$].

2. $\|e_0(N)\| > 0$ [$\|e_0^+(N)\| > 0$].

3. There exists a ZFE [DFE] in the restricted sense of (84) [(85)].

4. There exists an MFTF ZFE [DFE] in this restricted sense.

The question of when it can be asserted that $\|e_0(N)\| > 0$ and
$\|e_0^+(N)\| > 0$ deserves consideration. The condition that
$h_0 \in M_N^+$ requires that coefficients $\{a_m, 1 \le m \le N\}$
exist which satisfy

$$h_0 = \sum_{m=1}^{N} a_m h_m. \tag{88}$$

This occurrence will be precluded if the set
$\{h_m, -\infty < m < \infty\}$ is linearly independent. Similarly,
linear independence is sufficient for a ZFE to exist in the sense of
(84). The following lemma, which is proven in Appendix B, establishes
sufficient conditions for the linear independence of
$\{h_m, -\infty < m < \infty\}$.

Lemma 1: The following two conditions are sufficient for the linear
independence of $\{h_m, -\infty < m < \infty\}$:

1. $\|e_0\| > 0$ or $\|e_0^+\| > 0$.

2. There exists an interval $[a, b]$, $a < b$, such that
$R(\omega) > 0$, $\omega \in [a, b]$.

The first condition of Lemma 1 satisfies our intuition that if an
infinite MFTF ZFE or DFE exists then the finite MFTF version should also
exist. The second condition assures us that the finite equalizers also
exist under much weaker conditions.

The following theorem establishes a relationship between the finite 
and infinite equalizers, and is proven in Appendix B : 

Theorem 3: As $N \to \infty$, $\|e_0(N)\|^2$ is monotonically
decreasing and approaches $\|e_0\|^2$, and likewise for $e_0^+(N)$.
Furthermore, $\|e_0(N) - e_0\|^2 \to 0$ and
$\|e_0^+(N) - e_0^+\|^2 \to 0$.

The primary conclusion of Theorem 3 is that the infinite equalizer
can be approximated with arbitrary accuracy (in the sense of $L_2$
convergence) by a finite equalizer. In addition, it asserts that the
S/N ratio of this finite equalizer is greater than that of the infinite
equalizer; however, this desirable property may be entirely or partially
offset by any residual intersymbol interference.
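
The monotone convergence asserted by Theorem 3 can be observed by
solving the finite projection problem directly. The sketch below is an
assumption-laden illustration: the helper `finite_zfe_error` is
hypothetical, $T = 1$, and the autocorrelation ($R_0 = 1$,
$R_{\pm 1} = 0.4$, zero otherwise) is chosen only because its spectrum
$1 + 0.8\cos\omega$ is strictly positive and has zeros in $R^*(z)$, so
that the infinite ZFE does not collapse to a finite one.

```python
import numpy as np

def finite_zfe_error(r, N):
    """||e_0(N)||^2 for autocorrelation r(k), using pulses 0 < |k| <= N."""
    idx = [k for k in range(-N, N + 1) if k != 0]
    gram = np.array([[r(m - n) for n in idx] for m in idx])   # (h_m, h_n)
    cross = np.array([r(m) for m in idx])                     # (h_m, h_0)
    # Squared distance from h_0 to the span of the interfering pulses.
    return r(0) - cross @ np.linalg.solve(gram, cross)

# Hypothetical autocorrelation with spectrum 1 + 0.8 cos(omega) > 0.
def r(k):
    return {0: 1.0, 1: 0.4, -1: 0.4}.get(k, 0.0)

errors = [finite_zfe_error(r, N) for N in range(1, 16)]

omega = 2 * np.pi * np.arange(4096) / 4096
limit = 1.0 / np.mean(1.0 / (1 + 0.8 * np.cos(omega)))       # (60), T = 1
```

Here `errors` decreases with `N` and settles on `limit`, the value
given by (60) for the infinite equalizer.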

Each member of the sequence of equalizers guaranteed by Theorem
3 has different tap-gains, because the projection on a different
subspace is being taken with each $N$. A more aesthetically pleasing
approximation results when (58) and (59) are valid, for then

$$\left\| h_0 - \sum_{k=-N}^{N} a_k h_k - e_0 \right\| \to 0, \tag{89}$$

$$\left\| h_0 - \sum_{k=1}^{N} a_k^+ h_k - e_0^+ \right\| \to 0, \tag{90}$$

by the definition of convergence of the infinite sums in (58)-(59).
Each succeeding equalizer defined by (89)-(90) is obtained by adding
an additional tap, without changing the other tap-gains. As observed
by Doob (Ref. 3, p. 564), a convergent sum of the form of (58)-(59)
does not always exist; the following theorem gives sufficient conditions
for the validity of (58)-(59) which are generally satisfied in practical
problems:

Theorem 4: If there exist constants $K_1$ and $K_2$,
$0 < K_1 \le K_2$, such that $K_1 \le R(\omega) \le K_2$,
$|\omega| < \pi/T$, then convergent expansions of $e_0$ and $e_0^+$ of
the form of (58)-(59) exist. Furthermore, the coefficients of the
expansions are unique.

This theorem is proven in Appendix B. The question of uniqueness
of the tap-gains of the DFE is one which was not answered by Price. 6

Finally, the white output noise property of the MFTF DFE also
extends to a finite MFTF DFE in the following sense: If the reception
of (1) extends from $N_1$ to $N_2$, where $N_2$ (but not necessarily
$N_1$) is finite, then the DFE defined by

$$e_k^+ = h_k - P[h_k, M(h_m, k + 1 \le m \le N_2)]$$

will have white output noise samples. This fact is easily verified from
the same containment of subspaces that was used in the proof for the
infinite case.
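
A small numerical check of this orthogonality is sketched below. It is
only an illustration: the pulses are random vectors in a
finite-dimensional space (an assumption made here), and the projections
are computed by least squares rather than by any method from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
dim, n_pulses = 30, 6
H = rng.standard_normal((dim, n_pulses))    # hypothetical pulses h_0..h_5

errors = []
for k in range(n_pulses):
    later = H[:, k + 1:]                     # h_m, k+1 <= m <= N_2
    if later.shape[1] == 0:
        errors.append(H[:, k].copy())        # last pulse: nothing to remove
        continue
    coef, *_ = np.linalg.lstsq(later, H[:, k], rcond=None)
    errors.append(H[:, k] - later @ coef)

E = np.column_stack(errors)
gram = E.T @ E                               # (e_j^+, e_k^+) for all j, k
off_diag = gram - np.diag(np.diag(gram))
```

The off-diagonal entries of `gram` vanish to rounding error, so the
output noise samples $(e_k^+, n)$ are uncorrelated when $n$ is white.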

IV. EXTENSION TO NONSTATIONARY NOISE AND CHANNEL 

The previous sections have considered only the case where the 
additive noise is white. The extension to colored Gaussian noise can 
be handled in a straightforward fashion with the addition of a whiten- 
ing filter. In this section we will generalize the ZFE and DFE to the 
case of arbitrary nonstationary second-order Gaussian noise (which 
includes colored Gaussian noise as a special case) using the techniques 
of reproducing kernel Hilbert space (RKHS). 11 Although the cases 
for which the corresponding RKHS can be characterized explicitly 
correspond generally to those cases which can be handled by other 
techniques, the RKHS approach does allow us to treat all cases 
simultaneously and concisely. In addition, it enables us to generalize 
simultaneously to an arbitrary nonstationary channel (to be precise, a 
channel which is changing in time in a deterministic and known fashion) 
with no additional complications. Perhaps the most interesting out- 
come of this effort will be the observation that the DFE white output 
noise property (discussed in Section 3.3) remains valid in this general 
case. The result is an interesting generalization of Forney's whitened 
matched filter. 12 



* Theorem 4 remains valid under the weaker hypothesis that
$0 < \operatorname{ess\,inf} R(\omega)$ and
$\operatorname{ess\,sup} R(\omega) < \infty$.

To this end, modify (1) to

$$r(t) = \sum_{m=N_1}^{N_2} B_m h_m(t) + n(t), \tag{91}$$

where, as before, $N_1$ and $N_2$ can be infinite. The noise will be
assumed to be Gaussian with arbitrary autocorrelation

$$K(t, s) = E[n(t)n(s)]. \tag{92}$$

The subscript $m$ on $h_m(t)$ indicates that the received pulses need
not be translates of the same elementary waveform. The reception will be
termed channel stationary when

$$h_m(t) = h(t - mT)$$

and noise stationary when

$$K(t, s) = K(t - s).$$

We denote by $L_2(n)$ the subspace of the Hilbert space of square
integrable random variables spanned by $n(t)$, $-\infty < t < \infty$.
This subspace is entirely analogous to $M(X_k, -\infty < k < \infty)$
defined earlier, except that the underlying parameter $t$ is continuous.
The following lemma is applicable: 11

Lemma 2: Let $H(K)$ consist of all functions $g(\cdot)$ of the form

$$g(\cdot) = E[n(\cdot)U] \tag{93}$$

for some $U \in L_2(n)$. Then $H(K)$ is a Hilbert space with inner
product

$$(g, g)_{H(K)} = E|U|^2. \tag{94}$$

The mapping $\psi: L_2(n) \to H(K)$ defined by (93) is a congruence
which maps $n(t)$ into $K(\cdot, t)$.

The Hilbert space $H(K)$ defined by Lemma 2 is known as the
reproducing kernel Hilbert space with reproducing kernel $K$. It is
straightforward to show from (93) and (94) that $H(K)$ has the
properties

$$K(\cdot, t) \in H(K), \quad -\infty < t < \infty, \tag{95}$$

$$(g(\cdot), K(\cdot, t))_{H(K)} = g(t), \quad g \in H(K). \tag{96}$$

It can be shown 11 that for any symmetric positive-definite kernel $K$
there exists a unique Hilbert space satisfying (95)-(96).

The inverse of $g(\cdot)$ under $\psi$ is usually given the suggestive
notation

$$(g, n)_{H(K)} = \psi^{-1}(g) \tag{97}$$

even though $n \notin H(K)$ with probability one and therefore (97)
cannot be given an interpretation as an inner product.

It will be assumed that $h_m(t) \in H(K)$, since otherwise the
detection problem is singular.† 11 In nonstationary noise the space
$H(K)$ takes the place of $L_2$ in the earlier white noise problem.
Accordingly, we restrict the class of filters under consideration to
$H(K)$ inner products with elements of $H(K)$. Thus, a filter can be
written in the form

$$(g, r)_{H(K)} = \sum_{m=N_1}^{N_2} B_m (g, h_m)_{H(K)} + (g, n)_{H(K)}, \tag{98}$$

where the noise term in (98) assumes the special meaning of (97).
Analogously to (15), we define the pulse autocorrelation

$$R(m, n) = (h_m, h_n)_{H(K)}. \tag{99}$$

When the reception is noise and channel stationary, $R(m, n)$ is a
function of the difference of its arguments, as in (15). In general,
however, it is an arbitrary symmetric positive definite function defined
for $N_1 \le m, n \le N_2$.*

In the white noise case, we saw that the subspace of $L_2$ spanned by
translates of $h(t)$ was congruent to the subspace of second-order
random variables spanned by a wide-sense stationary random process.
In the nonstationary noise case, the subspace of $H(K)$ spanned by
$h_m$, $N_1 \le m \le N_2$, is congruent to the subspace of the
second-order random variables spanned by a possibly nonstationary
second-order random process. In the white noise case the theory of
minimum mean-square error estimation of a wide-sense stationary random
process was relevant; in the present case the random process becomes
nonstationary. As before, the ZFE and DFE have interpretations as
interpolation and prediction errors of the corresponding random process
with autocorrelation $R(m, n)$. However, rather than pursue these
correspondences further (in view of our results for the white noise case
they are obvious), we will directly pursue the theory of the ZFE and DFE
for the detection of $B_m$, $N_1 \le m \le N_2$, from $r(t)$ in (91).



† A singular detection problem is one in which a decision can be made
which is correct with probability one.

* The positive definite property follows from the inequality

$$0 \le \left\| \sum_m a_m h_{k_m} \right\|^2_{H(K)} = \sum_m \sum_n a_m a_n R(k_m, k_n).$$

The theory of Section 3.1 remains valid if the subspaces
$M(h_m, m \in I)$ are considered as subspaces of $H(K)$ rather than
$L_2$.† As before, the condition which is necessary and sufficient for
the existence of a ZFE or DFE is that

$$h_k \notin M(h_m, m \in I).$$
The analogs of the MFTF versions of the DFE and ZFE are the elements
given by (54) and (55), except that now we must work with $e_k$
and $e_k^+$ instead of $e_0$ and $e_0^+$ ($e_k$ is no longer necessarily
simply a time translate of $e_0$, etc.). A derivation similar to that
given in Section 3.2 establishes that $e_k$ and $e_k^+$ maximize the S/N
ratio as before. In particular, when the filter of (98) is restricted to
be a ZFE, (98) becomes

$$(g, r)_{H(K)} = B_k (g, h_k)_{H(K)} + (g, n)_{H(K)} \tag{100}$$

and the S/N ratio is proportional to

$$S/N \propto \frac{(g, h_k)^2_{H(K)}}{(g, g)_{H(K)}} \le \|e_k\|^2_{H(K)} \tag{101}$$

since the variance of the noise term in (100) is, from (97),

$$E|(g, n)_{H(K)}|^2 = E|\psi^{-1}(g)|^2 = (g, g)_{H(K)}$$

through the congruence established in Lemma 2. Equation (101)
demonstrates that the MFTF ZFE maximizes the S/N ratio, and the
same result follows for the DFE by the same method.

A general equation can be given for the projection element required
for the MFTF. This equation is entirely analogous to a result of
Parzen 11 for stochastic estimation. To this end we require a lemma
which is a restatement of Lemma 2:

Lemma 3: Let $H(R)$ consist of all functions $f(m)$, $m \in I$, of the
form

$$f(m) = (h_m, F)_{H(K)}, \quad m \in I \tag{102}$$

for some $F \in M(h_m, m \in I)$. Then $H(R)$ is the RKHS with
reproducing kernel $R(m, n)$, $m, n \in I$, and has inner product

$$(f, f)_{H(R)} = (F, F)_{H(K)}. \tag{103}$$

The mapping $\phi: M(h_m, m \in I) \to H(R)$ defined by (102) is a
congruence which maps $h_m$ into $R(\cdot, m)$.

† We use $I$ as a set of indices to avoid repeating the equations twice.
For the ZFE $I = [N_1, k - 1] \cup [k + 1, N_2]$ and for the DFE
$I = [k + 1, N_2]$. For the infinite case, $N_2 = -N_1 = \infty$. The
digit $B_k$ is being detected.

The reader might find it instructive to verify from (102)-(103)
that the RKHS properties hold for $H(R)$,

$$R(\cdot, n) \in H(R), \tag{104}$$

$$(f(\cdot), R(\cdot, n))_{H(R)} = f(n), \tag{105}$$

where $f(\cdot) \in H(R)$.

The problem we want to attack is finding the projection $P$ of some
vector $Q$ on $M(h_m, m \in I)$ (later we will let $Q = h_k$). From (3)
we have

$$(Q - P, h_m)_{H(K)} = 0, \quad m \in I \tag{106}$$

or

$$(P, h_m)_{H(K)} = \rho_Q(m), \quad m \in I, \tag{107}$$

where

$$\rho_Q(m) = (Q, h_m)_{H(K)}, \quad m \in I. \tag{108}$$

In (107), $\rho_Q(m)$ is a known function and $P$ is to be determined.
Assuming for the moment that $\rho_Q \in H(R)$, from Lemma 3 we see that
$\rho_Q$ is the image of $P$ under the congruence $\phi$, and hence

$$P = \phi^{-1}(\rho_Q), \tag{109}$$

which is the solution we desire. Using the congruence properties of
$\phi$, the length of $Q - P$ is

$$\|Q - \phi^{-1}(\rho_Q)\|^2_{H(K)} = \|Q\|^2_{H(K)} - 2(Q, \phi^{-1}(\rho_Q))_{H(K)} + \|\phi^{-1}(\rho_Q)\|^2_{H(K)} = \|Q\|^2_{H(K)} - \|\rho_Q\|^2_{H(R)}. \tag{110}$$

Establishing that in fact $\rho_Q \in H(R)$ is straightforward. Note
that

$$\rho_Q(m) = (Q, h_m)_{H(K)} = (P, h_m)_{H(K)}, \tag{111}$$

which implies that $\rho_Q \in H(R)$ by Lemma 3 since
$P \in M(h_m, m \in I)$. Replacing $Q$ by $h_k$ in (109), we get the
desired projection

$$P[h_k, M(h_m, m \in I)] = \phi^{-1}[R(k, \cdot)]. \tag{112}$$

The ZFE and DFE are obtained by letting $I$ equal the appropriate
set. The S/N ratios of the receivers are proportional to, from (101)
and (110),

$$S/N \propto \|h_k\|^2_{H(K)} - \|R(k, \cdot)\|^2_{H(R)}. \tag{113}$$

The RKHS approach has reduced the problem to that of finding
RKHS inner products. In some cases these inner products can be
explicitly characterized, while in all others they can be determined
by convergent iterative techniques. 11
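
In a finite-dimensional discrete-time setting the RKHS inner products
reduce to quadratic forms, which makes (113) easy to check. The sketch
below rests on assumptions made purely for illustration: the noise
covariance `K` is a random positive definite matrix, the $H(K)$ inner
product is taken as $g^T K^{-1} h$, the $H(R)$ norm as
$\rho^T R_I^{-1} \rho$, and the ZFE index set omits only the detected
pulse `k`.

```python
import numpy as np

rng = np.random.default_rng(2)
dim, n_pulses, k = 16, 5, 2

# Hypothetical noise covariance K (symmetric positive definite), pulses h_m.
M = rng.standard_normal((dim, dim))
K = M @ M.T + dim * np.eye(dim)
H = rng.standard_normal((dim, n_pulses))

K_inv = np.linalg.inv(K)

def ip(g, h):
    """Finite-dimensional H(K) inner product, g^T K^{-1} h."""
    return g @ K_inv @ h

R = H.T @ K_inv @ H                          # pulse autocorrelation (99)
I = [m for m in range(n_pulses) if m != k]   # ZFE index set

rho = R[I, k]                                # rho(m) = (h_k, h_m), m in I
R_I = R[np.ix_(I, I)]
weights = np.linalg.solve(R_I, rho)          # projection coefficients

# (113): ||h_k||^2 in H(K) minus ||R(k, .)||^2 in H(R).
snr_113 = R[k, k] - rho @ weights

# Direct check: squared H(K) norm of e_k = h_k - P.
e_k = H[:, k] - H[:, I] @ weights
snr_direct = ip(e_k, e_k)
```

Both routes give the same number, and `e_k` is orthogonal in the
$H(K)$ sense to every interfering pulse.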

We can also quickly show that the DFE white output noise property
discussed in Section 3.3 generalizes. From (98), the noise samples at
the filter output are

$$n_k = (e_k^+, n)_{H(K)} = \psi^{-1}(e_k^+) \tag{114}$$

by definition. From (114) and Lemma 2,

$$E(n_j n_k) = E[\psi^{-1}(e_j^+)\psi^{-1}(e_k^+)] = (e_j^+, e_k^+)_{H(K)} = 0, \quad j \neq k \tag{115}$$

by the same reasoning as before.

Finally, it is instructive to demonstrate that this RKHS formulation
reduces to the whitening filter approach when the reception is
noise and channel stationary. Assume that

$$K(t, s) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{j\omega(t - s)} N(\omega)\,d\omega, \tag{116}$$

where $N(\omega)$ is uniformly bounded and never vanishes. Under these
conditions we claim that $H(K)$ consists of all integrable $g(t)$ with
Fourier transforms $G(\omega)$ which satisfy

$$\|g\|^2_{H(K)} = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{|G(\omega)|^2}{N(\omega)}\,d\omega < \infty. \tag{117}$$

To verify this, properties (95)-(96) must be checked. Equation (95)
is valid since $N(\omega)$ is integrable, while (96) follows from

$$(g(\cdot), K(\cdot, t))_{H(K)} = \frac{1}{2\pi}\int_{-\infty}^{\infty} G(\omega)\,\frac{[e^{-j\omega t} N(\omega)]^*}{N(\omega)}\,d\omega = \frac{1}{2\pi}\int_{-\infty}^{\infty} G(\omega)e^{j\omega t}\,d\omega = g(t), \tag{118}$$

where (*) denotes complex conjugation. From (117), the $H(K)$ inner
product consists of a filter with frequency response $N^{-1}(\omega)$
(which is the whitening filter) followed by an ordinary $L_2$ inner
product, and is therefore consistent with the whitening filter
formulation.

V. CONCLUSIONS

This paper has presented a unified and rather thorough treatment 
of the ZFE and DFE. In a companion paper, 7 the geometric model of 
intersymbol interference developed here will be used to study the 
minimum distance problem encountered in the performance analysis 
of the maximum likelihood detector 12 and in evaluating a lower bound 
on the performance of any receiver. 14 It is shown there that a canonical 
relationship exists between the minimum distance and the performance 
and tap-gains of the MFTF DFE. 

No performance example comparing the DFE and ZFE on a channel
of practical interest has been given in this paper in order that the
maximum likelihood detector may enter into the comparison. In Ref. 7
the performance of three receivers is calculated for a channel whose
loss in dB increases as the square-root of frequency. This channel is
an excellent model of coaxial cable and some types of wire-pairs.

APPENDIX A

The purpose of this appendix is to derive an approximation to the 
Fourier coefficients of (48) in terms of discrete Fourier transform 
(DFT), which can be efficiently evaluated using the FFT algorithm. 

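Define a normalized function

$$F(\lambda) = \log R\!\left(\frac{2\pi\lambda}{T}\right) \tag{119}$$

so that

$$r_n = \frac{1}{2}\int_{-1/2}^{1/2} e^{-j2\pi n\lambda} F(\lambda)\,d\lambda. \tag{120}$$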

Approximating the integral by a summation, 

- 2 —" » % F ( X ° + F - "2 ) e """"' <121> 
where the sum on the right is a discrete Fourier transform.

In order to determine the effect of this approximation, substitute

$$\frac{1}{2}F(\lambda) = \sum_{k=-\infty}^{\infty} r_k e^{j2\pi k\lambda} \tag{122}$$

into the approximation equation (121) to yield 



oo I JV-1 

= rn+Ze** lN *>(-l) l r n+lN . 



(123) 



Thus, the approximation of (121) yields the desired Fourier coefficient
plus the sum of alias terms. $N$ must be larger than the number of
coefficients to be evaluated and large enough that the alias terms
$r_{n+lN}$
are small. In practice, N ^ 5,000 can be achieved with modest amounts 
of computer time using the FFT algorithm. 
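
The aliasing behavior described above can be observed with a standard
FFT routine. The sketch below does not reproduce the exact
normalization or sampling offsets of (119)-(121); it assumes $T = 1$,
samples $\log R(\omega)$ on $[0, 2\pi)$, and uses the exponential
example $R_k = A^{|k|}$, for which the Fourier coefficients of
$\log R$ are known in closed form ($\log(1 - A^2)$ at lag 0 and
$A^n/n$ at lag $n \ge 1$).

```python
import numpy as np

A = 0.5
N = 1024                                  # DFT size
omega = 2 * np.pi * np.arange(N) / N
R = (1 - A**2) / (1 - 2 * A * np.cos(omega) + A**2)

# DFT approximation of the Fourier coefficients of log R(omega).
coeffs = np.fft.ifft(np.log(R)).real

# Closed-form coefficients for this spectrum at lags 1..10.
n = np.arange(1, 11)
exact = A ** n / n
max_err = np.abs(coeffs[1:11] - exact).max()
pred_err_sq = np.exp(coeffs[0])           # (61) via the lag-0 coefficient
```

Because the coefficients decay geometrically here, the alias terms at
lags $n + lN$ are negligible for this `N`, and `max_err` sits at the
level of rounding error.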

APPENDIX B 

Proofs of Theorems 

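Proof of Lemma 1: Since $\|e_0^+\|^2 \ge \|e_0\|^2$ it suffices to show
that $\|e_0^+\| > 0$ implies that $\{h_m, -\infty < m < \infty\}$ is a
linearly independent set. To this end, assume that

$$\left\| \sum_{m=1}^{N} \alpha_m h_{k_m} \right\|^2 = 0, \quad k_1 < k_2 < \cdots < k_N. \tag{124}$$

To show that $\alpha_1 = 0$, assume to the contrary that
$\alpha_1 \neq 0$ and note that

$$\left\| \sum_{m=1}^{N} \alpha_m h_{k_m} \right\|^2 = |\alpha_1|^2 \left\| h_{k_1} + \sum_{m=2}^{N} \frac{\alpha_m}{\alpha_1} h_{k_m} \right\|^2 \ge |\alpha_1|^2 \|e_0^+\|^2 > 0. \tag{125}$$

This contradiction establishes that $\alpha_1 = 0$. Continuing by
induction in the same fashion, it can be shown that $\alpha_m = 0$,
$1 \le m \le N$.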

To show that the second condition of Lemma 1 implies linear
independence, we use a proof similar to Tufts'. 13 By the congruence of
(22), (124) is equivalent to

$$\int_{-\pi/T}^{\pi/T} \left| \sum_{m=1}^{N} \alpha_m e^{-jk_m\omega T} \right|^2 R(\omega)\,d\omega = 0,$$

which implies that the integrand is zero almost everywhere on $[a, b]$.
This is impossible unless $\alpha_m = 0$, $1 \le m \le N$, since
otherwise

$$\sum_{m=1}^{N} \alpha_m e^{-jk_m\omega T}$$

has at most a finite number of algebraic zeros on $[a, b]$ and
$R(\omega)$ is strictly positive.

Proof of Theorem 3: We will prove the result for the ZFE; the proof
for the DFE is identical. Since for $N \le M$

$$M(h_k, |k| \le N, k \neq 0) \subset M(h_k, |k| \le M, k \neq 0) \subset M(h_k, k \neq 0),$$

the inequality

$$\|e_0\| \le \|e_0(M)\| \le \|e_0(N)\|$$

follows. Hence $\|e_0(N)\|^2$ must approach a limit,

$$\lim_{N \to \infty} \|e_0(N)\| \ge \|e_0\|.$$

Denote by the shortened notation $P$ the projection of $h_0$ on
$M(h_k, k \neq 0)$ (so that $e_0 = h_0 - P$). Since
$P \in M(h_k, k \neq 0)$, there exists a sequence
$\gamma_n \in M(h_k, |k| \le n, k \neq 0)$ such that $\gamma_n \to P$
and we have

$$\|h_0 - \gamma_n\|^2 = \|e_0\|^2 + \|P - \gamma_n\|^2.$$

For any $\epsilon > 0$, there exists an $N(\epsilon)$ such that

$$\|h_0 - \gamma_n\|^2 \le \|e_0\|^2 + \epsilon$$

for $n \ge N(\epsilon)$, and since
$\|e_0(n)\|^2 \le \|h_0 - \gamma_n\|^2$ we have

$$\|e_0\|^2 \le \|e_0(n)\|^2 \le \|e_0\|^2 + \epsilon,$$

which establishes that $\|e_0(n)\| \to \|e_0\|$. The remainder of the
proof follows that of the projection theorem. By the parallelogram law,

$$\|e_0(N) - e_0\|^2 = 2\|e_0(N)\|^2 + 2\|e_0\|^2 - \|e_0(N) + e_0\|^2,$$

but defining $P(N) = P[h_0, M(h_k, |k| \le N, k \neq 0)]$,

$$\|e_0(N) + e_0\|^2 = \|h_0 - P(N) + h_0 - P\|^2 = 4\left\| h_0 - \frac{P(N) + P}{2} \right\|^2 \ge 4\|e_0\|^2,$$

we have

$$\|e_0(N) - e_0\|^2 \le 2[\|e_0(N)\|^2 - \|e_0\|^2] \to 0.$$
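Proof of Theorem 4: From (22) we have

$$\left\| \sum_{m=1}^{N} \beta_m h_{k_m} \right\|^2 = \frac{T}{2\pi}\int_{-\pi/T}^{\pi/T} \left| \sum_{m=1}^{N} \beta_m e^{-jk_m\omega T} \right|^2 R(\omega)\,d\omega.$$

A standard result of Toeplitz theory asserts that

$$\left( \sum_{m=1}^{N} |\beta_m|^2 \right) \operatorname{ess\,inf} R(\omega) \le \left\| \sum_{m=1}^{N} \beta_m h_{k_m} \right\|^2 \le \left( \sum_{m=1}^{N} |\beta_m|^2 \right) \operatorname{ess\,sup} R(\omega).$$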

The conclusions of the theorem then follow from Theorem 5 17 18 of
Ref. 10.